Execution and Cache Performance of the Scheduled Dataflow Architecture

Authors

  • Krishna M. Kavi
  • Joseph Arul
  • Roberto Giorgi
Abstract

This paper presents an evaluation of our Scheduled Dataflow (SDF) processor. Recent work on new processor architectures has focused mainly on VLIW (e.g., IA-64), superscalar, and superspeculative designs. This trend yields better performance at the expense of increased hardware complexity and a brute-force solution to the memory-wall problem. Our research deviates substantially from this trend by exploring a simpler yet powerful execution paradigm based on dataflow concepts. A program is partitioned into functional execution threads, which are well suited to our non-blocking multithreaded architecture. In addition, all memory accesses are decoupled from the thread's execution: data is pre-loaded into the thread's context (registers), and all results are post-stored after the thread completes execution. This decoupling requires a separate unit to perform the necessary pre-loads and post-stores and to control the allocation of hardware thread contexts to enabled threads. An analytical evaluation of our architecture showed that it could achieve better performance than classical dataflow architectures (i.e., ETS), hybrid models (e.g., EARTH), and decoupled multithreaded architectures (e.g., the Rhamma processor). This paper analyzes the architecture using an instruction-set-level simulator on a variety of benchmark programs. We compare the execution cycles that programs require on SDF with the cycles the same programs require on DLX (or MIPS). We then investigate the expected cache-memory performance by collecting address traces from the programs and feeding them to a trace-driven cache simulator (Dinero-IV). We present these results in this paper.

Category: Processor Architectures, Performance of Systems.
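The decoupled model described in the abstract (pre-load into registers, execute without touching memory, post-store the results) can be illustrated with a toy sketch. This is not the SDF simulator or its instruction set; every name here (`Thread`, `memory_unit_preload`, `execution_unit_run`, `memory_unit_poststore`) is hypothetical, and the phases run one after another for clarity, whereas the architecture is described as handling memory accesses in a separate unit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Thread:
    """A non-blocking thread: all inputs are known before it starts running."""
    load_addrs: List[int]                   # addresses to pre-load (assumed layout)
    store_addrs: List[int]                  # addresses that receive the results
    body: Callable[[List[int]], List[int]]  # computation that sees only registers
    registers: List[int] = field(default_factory=list)
    results: List[int] = field(default_factory=list)


def memory_unit_preload(thread: Thread, memory: Dict[int, int]) -> None:
    # Phase 1: copy every operand from memory into the thread's context.
    thread.registers = [memory[addr] for addr in thread.load_addrs]


def execution_unit_run(thread: Thread) -> None:
    # Phase 2: run the thread body on registers only; no memory access here.
    thread.results = thread.body(thread.registers)


def memory_unit_poststore(thread: Thread, memory: Dict[int, int]) -> None:
    # Phase 3: write the results back after the thread has finished executing.
    for addr, value in zip(thread.store_addrs, thread.results):
        memory[addr] = value


def run(threads: List[Thread], memory: Dict[int, int]) -> None:
    # Simplification: phases are serialized per thread instead of overlapped.
    for thread in threads:
        memory_unit_preload(thread, memory)
        execution_unit_run(thread)
        memory_unit_poststore(thread, memory)


memory = {0: 3, 1: 4, 2: 0}
product = Thread(load_addrs=[0, 1], store_addrs=[2],
                 body=lambda regs: [regs[0] * regs[1]])
run([product], memory)
print(memory[2])  # 12
```

Because the thread body never reads memory, it cannot stall on a cache miss once it has started, which is the property the non-blocking model relies on.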

Similar resources

Execution And Cache Performance Of A Decoupled Non-Blocking Multithreaded Architecture

In this paper we will present an evaluation of the execution performance and cache behavior of a new multithreaded architecture being investigated by the authors. Our architecture uses a non-blocking multithreaded model based on the dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are p...

Execution Performance of the Scheduled Dataflow Architecture (SDF)

This paper presents an evaluation of a nonblocking, decoupled memory/execution, multithreaded architecture known as the Scheduled Dataflow (SDF). Recent focus in the field of new processor architectures is mainly on VLIW (e.g. IA-64), superscalar and superspeculative designs. This trend allows for better performance at the expense of increased hardware complexity, and possibly higher power expe...

Scheduled Dataflow: Execution Paradigm, Architecture, and Performance Evaluation

In this paper, the Scheduled Dataflow (SDF) architecture, a decoupled memory/execution, multithreaded architecture using nonblocking threads, is presented in detail and evaluated against Superscalar architecture. Recent focus in the field of new processor architectures is mainly on VLIW (e.g., IA-64), superscalar, and superspeculative designs. This trend allows for better performance, but at the...

Design of cache memories for dataflow architecture

The recent advance in dataflow processing — to combine the dataflow paradigm with the control flow paradigm — has brought out many new challenging issues. This hybrid organization has made it possible to study and adapt familiar control flow concepts such as cache memories within the framework of the dataflow architecture. The concept of cache memory has proven its effectiveness in the von Neum...

Comparing Execution Performance of Scheduled Dataflow With RISC Processors

In this paper we describe a new approach to designing multithreaded architectures that can be used as the basic building blocks in high-end computing architectures. Our architecture uses a non-blocking multithreaded model based on the dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are ...

Journal:
  • J. UCS

Volume 6

Publication date: 2000